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Abstract. Functional brain connectivity, as revealed through distant 
correlations in the signals measured by functional Magnetic Resonance 
Imaging (fMRI), is a promising source of biomarkers of brain patholo- 
gies. However, establishing and using diagnostic markers requires prob- 
abilistic inter-subject comparisons. Principled comparison of functional- 
connectivity structures is still a challenging issue. We give a new matrix- 
variate probabilistic model suitable for inter-subject comparison of func- 
tional connectivity matrices on the manifold of Symmetric Positive Def- 
inite (SPD) matrices. We show that this model leads to a new algorithm 
for principled comparison of connectivity coefficients between pairs of re- 
gions. We apply this model to comparing separately post-stroke patients 
to a group of healthy controls. We find neurologically-relevant connection 
differences and show that our model is more sensitive that the standard 
procedure. To the best of our knowledge, these results are the first re- 
port of functional connectivity differences between a single-patient and 
a group and thus establish an important step toward using functional 
connectivity as a diagnostic tool. 

1 Introduction 

The correlation structure of brain activity, measured via fMRI, reveals stable 
inter-subject networks of distant brain regions that can be the expression of 
cognitive function. In particular, some connectivity networks are present in the 
absence of stimuli. They can reveal intrinsic brain activity and are studied in the 
restmg-state paradigm. These structures are of particular interest to study and 
diagnose brain diseases and disorders [T] as they can be used for deep probes of 
brain function on diminished subjects. Not only can they extract medically or 
cognitively relevant markers on subjects unconscious [2J, or with limited coop- 
eration [3] , but they also give information on higher-level cognitive systems that 
are challenging to probe via medical imaging or behavioral clinical tests [4]. 

To use functional connectivity as a quantitative inference tool, principled 
probabilistic comparison of connectivity structures across subjects is needed. 
Unlike with stimuli-response studies used routinely in functional neuroimaging, 
this comparison is challenging, as the underlying description of the signal is mul- 
tivariate: each brain-activity time course is considered relative to others. Uni- 
variate group models, such as random effects or mixed effects, are in general not 



sound as they neglect the strong statistical dependence between parameters es- 
timated from the data. Multivariate techniques have been successfully employed 
to single out outlying subjects [5], but have met little success: their results are 
difficult to interpret as they do not point to specific localized differences. 

In this paper, we focus on the description of brain functional-connectivity 
using inter-regions correlation matrices. We first review the current practice 
in inter-subject functional covariance comparison and recall some results on the 
manifold of covariance matrices. Then, we introduce a probabilistic model at the 
group level for the different subjects' correlation matrices, and a corresponding 
algorithm to detect connectivity differences in a specific parametrization of the 
covariance matrices, as correlations are a form of covariance. We quantify on 
simulated data the performance of this detection. Finally, we apply the model 
to the individual comparison of the connectivity structure of stroke patients to 
a group of healthy controls, and show that it outperforms the current practice. 

2 State of the art 

2.1 Problem statement: comparing functional brain connectivity 

We consider S subjects, represented by the correlation matrices between brain- 
activity time series extracted from n ROIs: {S s e R nx ™,s = 1...S}. The 
challenge is to give a probabilistic description of the population of correlation 
matrices so as to find the significant differences between subjects or groups. The 
current practice in functional-connectivity studies is to compare the coefficients 
of the correlation matrices across subjects (see for instance |3l6j ). This procedure 
can be expressed as a univariate additive linear model on the correlation matrix: 

£ s = £* + dST (f) 

where S* is a covariance matrix representative of the mean effect, or the group, 
and dS s encode subject-specific contributions. 

However, with this description it is difficult to isolate significant contribu- 
tions to dS s . Indeed, for interpretation, some coefficients are zeroed out, eg by 
thresholding a test statistic, as in [3J, which eventually leads to a non positive 
definite matrix, for which it is impossible to write a multivariate normal like- 
lihood. As a result, the subject-variability description learned on a population 
cannot give probabilistic tests on new subjects. 

2.2 Recent developments on the covariance-matrix manifold 

The mathematical difficulty stems from the fact that the space of SPD matri- 
ces, 5j/m+, does not form a vector space: A, B e Sym+ ^> A — B € Sym^. 
The Fisher information matrix of the multivariate normal distribution can be 
used to construct a metric on a parametrization of covariance matrices [7 and 
thus define Sym^ as a Riemannian manifold that is well-suited for performing 
statistics on covariances [8]. Local differences on this manifold can be approxi- 
mated by vectors of the tangent space: if B is close enough to A, the application: 



4>\ : B — » log(A-2BA-2) maps locally the bipoint A, B € Sym^ x Sym^ to 
Ai € Sym n , the space of symmetric matricefl A convenient parametrization 
of ^ € Sym n is Vec(W) = {V2wi.j, j < i , Wij, i = 1 . . .n} that forms an or- 
thonormal basis of the tangent space [8] . Finally, IIAB-II . = 1 1 log (A 2BA 2-)|| 

gives the intrinsic norm of AB* on the Sym^ manifold, according to the metric 
around point A: the distance between A and B in the manifold. 



3 Matrix-variate random effects model for covariances 

Multi-subject probability distribution for covariance matrices Using the Rieman- 
nian parametrization of Syni£, we describe the individual correlation matrix 
population as a distribution of matrices scattered around a covariance matrix 
representative of the group, X*. As this distribution must be estimated with a 
small number of observations compared to the feature space, we model it using 
the probability density function that minimizes the information with a given 
mean on the manifold, the generalized Gaussian distribution |S]: 

p(X) = ^)exp(-^||X^||| Y (2) 

where a encodes an isotropic variance on the manifold and k is a normalization 
factor. Given multiple observations of X corresponding to individual correlation 
matrices, X s , the maximum likelihood estimate of X* is independent of a and 
given by the Frechet mean of the observations jSJ, minimizing /J||X^j || s<[ . 



Parametrization in the tangent space We express the individual covariance ma- 
trices as a perturbation of the group covariance matrix X*: 

Vs = l...S, X s = s I(dX s ) = S*5 exp(dS s )S*5 ; (3) 

thus, using ©, p(dX s ) - fc'(<T)exp(-^||d£ s ||:n. (4) 

The parameters of Vec(dS s ) follow a normal distribution, with diagonal co- 
variance <t, and the maximum-likelihood estimate of a is given by (7^[ LE = 
\ J2 S II Vec(dS s )|||. The model can thus be interpreted as a random-effect model 
on the parametrization of Vec(dS s ), in the space tangent to the manifold Sym^. 
Assuming that the distribution is narrow on the manifold, ||dS s ||2 <C 1, eq. 
can be seen as the application of the placement function to move a noise dH s 
isotropic around I„ to S* (see [8], section 3.5): 



£* 5 (I„ +dS s )S* 5 . (5) 



1 Note that we do not use the same definition of the mapping as in [817] , as we are 
interested in mapping to Sym n , the tangent space around I n , and not the tangent 
space in A. It extracts a statistically independent parametrization (Eq. Q and (0])). 



Algorithm 1 Estimation of the group model 

1: Input: individual time series X 1 . . . X s . 

2: Output: estimated group covariance matrix £*, group variance a. 

3: for s — 1 to S do 

4: Compute S s <- LedoitWolf(X s ). 

5: end for 

6: Compute S* <— intrinsic mean(S 1 . . . S s ). 

7: for s = 1 to S do 

8: Compute dS* <- £*~ 5 S S S* _ " -I„. 

9: end for 



Model estimation from the data We start from individual time-series of brain ac- 
tivity in selected regions of interest, X g R™ xt . We use the Ledoit-Wolf shrinkage 
covariance estimator [5] for a good bias-variance compromise when estimating 
correlation matrices from t time points with n < t < n 2 . From this estimate 
of individual correlation matrices, we compute the intrinsic mean on Sym^ us- 
ing algorithm 3 of [TU]. Finally, we estimate a from the residuals of individual 
correlation matrices in the space tangent in X* (see algorithm [1} . 

4 Testing pair-wise correlations statistics 

The multivariate probabilistic model for correlations between regions exposed 
above enables us to define an average correlation matrix of a group, as well as 
the dispersion of the group in the covariance matrix space. Thus it can be used 
to test if a subject is different to a control group. However, to aid diagnosis, it 
is paramount to pin-point why such a subject may be different. In the tangent 
space, the parameters dSf j of Vec(dX s ) are mutually independent. We can 
thus conduct univariate analysis on these parameters to test which significantly 
differs from the control group. However, the independence of the parameters is 
true only in the space tangent at the population average X*, of which we only 
have an estimate X*. Thus, to account for projection error, we resort to non- 
parametric sampling of the control population to define a null distribution for 
the parameters dE?j. 

Specifically, we are interested in testing if a difference observed for a sub- 
ject in one of the dSf j can be explained by variation of the control population. 
As the control population is typically small, we generate the null distribution by 
leave one out: for each control subject, we generate surrogate control populations 

5 by bootstrap from the other control subjects and estimate the corresponding 
average covariance X*. We use X* to project all the individual correlation ma- 
trices, including the left out subject, to compute dIJ-j, and we do a one sample 
T test of the difference between dSf^ for the left out subject with regards to the 
resampled control group S. We record the values of this T test as an estimate of 
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Algorithm 2 Coefficient-level tests 

1: Input: individual time series for controls X 1 . . .X s and a patient X fc , p-value p, 

number of bootstraps, m. 
2: Output: Pair-wise p-values pij controlling for the difference in dSij between the 

patient and the control group. 
3: Initialize P°j empty lists, for i,j 6 {I . . . n}, j < i. 
4: for 1 to m do 

5: Choose a surrogate patient § £ 1 . . . S. 

6: Choose a subset S of {1 . . . S}\5 of S surrogate controls. 

7: Compute S* and dS s for s £ S using algorithm [T] on the surrogate controls. 

8: Compute dS s for s = s, using S* and eqn[5] 

9: For all append to P°j the T test comparing dS s ,j for s £ S and for s = S. 
10: end for 

11: Compute S* and dS s for s £ S using algorithm [1] on the complete control group. 
12: Compute dS fc , using S* and eqnJS] 

13: For all compute tij the T test comparing dSf j for s G S and for s — s. 
14: pij = 1 — quantile( tij in P°A 



the null distribution P°- of the T test on the corresponding coefScient between 

the controls and a patient. Finally, we estimate the average covariance E* for 
the complete group of controls and, for each k patient to investigate, we perform 
a T test of the difference between dSfj for the patient and the control group. 
We use P°j to associate a p-value to each coefficient per subject. We correct for 
multiple comparisons using Bonferroni correction: for each patient, \n (n — 1) 
tests are performed. The procedure is detailed in algorithm [2] 

5 Algorithm evaluation on simulated data 

Algorithm [5] relies on approximations of the exact problem for coefficient-level 
detection of differences. In order to quantify the performance of this detection, 
we study Receiver Operator Characteristic (ROC) on simulated data: we draw a 
population of control covariances using eq. ([5]) with the parameters of Vec(dS) 
normally distributed with deviation a . For simulated patients, we add differences 
of amplitude dS to a few coefficients (~ 20) of this variability noise. For S* and 
cr, we use the parameters estimated on real data (section [5]). We investigate the 
performance of algorithm [5] to recover these differences for a variety of parame- 
ters. We observe good recovery for dS > a (Fig[T]), and find that the comparison 
in the tangent space (eq. [S]) outperforms a comparison in R nXTl (eq. [TJ. 

6 Application to post-stroke connectivity reorganization 

Standard clinical scores, such as the NIHSS, as well as fMRI studies can be 
used to assess the consequences of cerebral strokes, but they test specific cogni- 
tive functions and have little sensitivity to higher-order cognitive malfunctions. 
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Fig. 1. ROC curves on data simulated according to the variability model given 
by eqn[5j (a) for different values of patient differences dS. (b) for different values 
of control variability a. (c) for different number of controls. 



Resting-state functional-connectivity is thus a valuable tool to study post-stroke 
reorganization. We apply our model to stroke patients. 

Resting-state fMRI dataset After giving informed consent, ischemic-stroke pa- 
tients, as well as age-matched healthy controls, underwent MRI scanning. Sub- 
jects with existing neurology, psychiatry, or vascular pathologies were excluded 
from the study. 10 patients and 20 controls were scanned during a resting-state 
task: subjects were given no other task than to stay awake but keep their eyes 
closed. 2 sessions of 10 minutes of fMRI data were acquired on a Siemens 3T 
Trio scanner (245 EPI volumes, TR=2.46s, 41 slices interlaced, isotropic 3 mm 
voxels). After slice-timing, motion correction, and inter-subject normalization 
using SPM5, 33 ROIs were defined in the main resting-state networks by in- 
tersecting a segmentation of the gray matter with correlation maps from seeds 
selected from the literature. For each subject, the BOLD time series correspond- 
ing to the regions were extracted and orthogonalized with respect to confound 
time series: time courses of the white matter and the cercbro-spinal fluid, and 
movement regressors estimated by the motion-correction algorithm. Covariance 
modeling was performed on the resulting 33 time series. 

Separating patients from controls with the matrix-variate covariance model. To 
measure the discriminative power of the matrix-variate model introduced in sec- 
tion [3l we test the likelihood of patient data in a model learned on controls. 
Specifically, we evaluate by leave one out the likelihood of each control in the 
model learned on the other controls. We compare this value to the average like- 
lihood of patients in the 20 models obtained by leave one out. We perform this 
comparison both using the group model isotropic on the tangent space (eq. [5]), 
and the group model isotropic in R nx ™ (eq. [TJ. We find that the model in the 
tangent space separates better patients from controls (Fig [2]). 

Detected connection differences We apply algorithm [2] to detect the significant 
coefficient-level differences for each subject. We compare to a similar univariate 
procedure applied to the parametrization in R nx ™ given by eq. ([1]), rather than 
the tangent space. We find that coefficient-level analysis detects more differences 
between ROI pairs when applied on the tangent-space parametrization (Fig [2k). 
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Fig. 2. (a) Likelihood of the controls and the patients in the model parametrized 
in the tangent space, (b) Likelihood of the controls and the patients in the 
model parametrized in M. nxn . (c) Number of coefficients detected as significantly 
different from the control group per patient, for the model parametrized in the 
tangent space, as well as in '. 




Fig. 3. Significant differences on two patients (p < 0.05 uncorrected), repre- 
sented as connections between regions: increased connectivity appears in red, 
and decreased in blue. The lesion, manually segmented from anatomical images, 
is represented in green. ROIs fully covered by a lesion are marked with a black 
cross on the correlation matrix. 



7 Discussion 

Interpretation of the tangent space Projecting on the space tangent to the group 
mean corresponds to applying a whitening matrix learned on the group 

(eq. [5]) that converts the Gaussian process described by the group covariance 
to an independent and identically distributed (iid) process. In other words, the 
coloring of the time series common to the group is canceled out to compare 
subjects on iid coefficients on the correlation matrix. 

Probing neurological processes For certain subjects, both procedures fail to de- 
tect a single connection that makes a significant difference. Indeed, the variability 
of resting-state activity in the control group induces some variability in the pro- 
jection to the tangent space. For patients with small lesions, this variability is 



larger than the univariate differences. On the other hand, for patients with im- 
portant lesions, the functional connectivity analysis reveals profound differences 
in the correlation structure that reflect functional reorganization. Some express 
a direct consequence of the lesion, for example when the gray matter in one 
of two ROIs has been damaged by the lesion, as can be seen on Fig [3£i. Oth- 
ers reflect functional reorganization. For instance, patient 10 has a right visual 
cortex damaged by a focal lesion, but the analysis shows increased connectivity 
in his left visual cortex (Fig Eb) . Functional connectivity analysis thus reveals 
modifications that go beyond the direct anatomical consequences of the lesion. 

8 Conclusion 

We have presented a matrix-variate probabilistic model for covariances, and 
have shown that it can be expressed as a random effect model on a particular 
parametrization of the covariance matrix. The ability to draw conclusions on 
the connectivity between pairs of regions is important because it is a natural 
representation of the problem. We applied this model to the comparison of func- 
tional brain connectivity between subjects. We were able to detect significant 
differences in functional connectivity between a single stroke patient and 20 con- 
trols. A controlled detection of network-wide functional-connectivity differences 
between subjects opens the door to new markers of brain diseases as well as new 
insights in neuroscience, as functional connectivity can probe phenomena that 
are challenging to access via stimuli-driven studies. 
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